AITopics | use case

Rebuilding the data stack for AI

MIT Technology ReviewApr-27-2026, 13:00:00 GMT

Enterprise AI hinges on high-accuracy outputs, requiring better data context, unified architectures, and rigorous measurement frameworks, says Bavesh Patel, senior vice president at Databricks, and Rajan Padmanabhan, unit technology officer at Infosys. Artificial intelligence may be dominating boardroom agendas, but many enterprises are discovering that the biggest obstacle to meaningful adoption is the state of their data. While consumer-facing AI tools have dazzled users with speed and ease, enterprise leaders are discovering that deploying AI at scale requires something far less glamorous but far more consequential: data infrastructure that is unified, governed, and fit for purpose. That gap between AI ambition and enterprise readiness is becoming one of the defining challenges of this next phase of digital transformation. As Bavesh Patel, senior vice president of Databricks, puts it, "the quality of that AI and how effective that AI is, is really dependent on information in your ...

artificial intelligence, databrick, social media, (17 more...)

MIT Technology Review

Industry: Information Technology > Services (0.35)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.68)

Add feedback

Synthcity: a benchmark framework for diverse use cases of tabular synthetic data

Neural Information Processing SystemsApr-24-2026, 13:53:54 GMT

Accessible high-quality data is the bread and butter of machine learning research,1 and the demand for data has exploded as larger and more advanced ML models are2 built across different domains. Yet, real data often contain sensitive information,3 subject to various biases, and are costly to acquire, which compromise their quality4 and accessibility. Synthetic data have thus emerged as a complement, sometimes5 even a replacement, to real data for ML training. However, the landscape of6 synthetic data research has been fragmented due to the large number of data7 modalities (e.g., tabular data, time series data, images, etc.) and various use cases8 (e.g., privacy, fairness, data augmentation, etc.). This poses practical challenges9 in comparing and selecting synthetic data generators in different problem settings.10 To this end, we develop Synthcity, an open-source Python library that allows11 researchers and practitioners to perform one-click benchmarking of synthetic data12 generators across data modalities and use cases. In addition, Synthcity's plug-in13 style API makes it easy to incorporate additional data generators into the framework.14 Beyond benchmarking, it also offers a single access point to a diverse range of15 cutting-edge data generators. Through examples on tabular data generation and16 data augmentation, we illustrate the general applicability of Synthcity, and the17 insight one can obtain.18

data mining, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre: Research Report (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

0b9536e186a77feff516893a5f393f7a-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 12:48:06 GMT

explanation, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.69)

Add feedback

0b9536e186a77feff516893a5f393f7a-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 12:48:03 GMT

explanation, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Human Computer Interaction (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Add feedback

OCCGEN: Selection of Real-world Multilingual Parallel Data Balanced in Gender within Occupations

Neural Information Processing SystemsApr-24-2026, 11:29:24 GMT

This paper describes the OCCGEN toolkit, which allows extracting multilingual parallel data balanced in gender within occupations. OCCGEN can extract datasets that reflect gender diversity (beyond binary) more fairly in society to be further used to explicitly mitigate occupational gender stereotypes. We propose two use cases that extract evaluation datasets for machine translation in four high-resource languages from different linguistic families and in a low-resource African language. Our analysis of these use cases shows that translation outputs in high-resource languages tend to worsen in feminine subsets (compared to masculine), specially in the directions containing English. This is confirmed by the human evaluation. We hypothesize that a sound language generation may contribute to pay less attention to the source sentence and to overgeneralize to the most frequent gender forms.

artificial intelligence, natural language, occupation, (17 more...)

Neural Information Processing Systems

Country:

North America (0.68)
Europe > Spain (0.28)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

00dada608b8db212ea7d9d92b24c68de-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsApr-24-2026, 04:43:43 GMT

artificial intelligence, machine learning, video, (15 more...)

Neural Information Processing Systems

Industry:

Information Technology (0.93)
Education > Curriculum > Subject-Specific Education (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Mobile (0.68)

Add feedback

ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation

Neural Information Processing SystemsMar-22-2026, 03:30:41 GMT

To mitigate these risks, current evaluation benchmarks predominantly employ expert-designed contextual scenarios to assess how well LLMs align with human values. However, the labor-intensive nature of these benchmarks limits their test scope, hindering their ability to generalize to the extensive variety of open-world use cases and identify rare but crucial long-tail risks. Additionally, these static tests fail to adapt to the rapid evolution of LLMs, making it hard to evaluate timely alignment issues. To address these challenges, we propose ALI-Agent, an evaluation framework that leverages the autonomous abilities of LLM-powered agents to conduct in-depth and adaptive alignment assessments. ALI-Agent operates through two principal stages: Emulation and Refinement.

artificial intelligence, large language model, natural language, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)

Add feedback

HEMM: Holistic Evaluation of Multimodal Foundation Models

Neural Information Processing SystemsMar-20-2026, 09:08:52 GMT

Multimodal foundation models that can holistically process text alongside images, video, audio, and other sensory modalities are increasingly used in a variety of real-world applications. However, it is challenging to characterize and study progress in multimodal foundation models, given the range of possible modeling decisions, tasks, and domains. In this paper, we introduce Holistic Evaluation of Multimodal Models (HEMM) to systematically evaluate the capabilities of multimodal foundation models across a set of 3 dimensions: basic skills, information flow, and real-world use cases. Basic multimodal skills are internal abilities required to solve problems, such as learning interactions across modalities, fine-grained alignment, multi-step reasoning, and the ability to handle external knowledge.

artificial intelligence, machine learning, proceedings, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.38)

Add feedback

0b9536e186a77feff516893a5f393f7a-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 21:55:14 GMT

explanation, explanation method, use case, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Human Computer Interaction (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Add feedback

Drink Whole Milk, Eat Red Meat, and Use ChatGPT

The Atlantic - TechnologyFeb-13-2026, 22:27:00 GMT

Robert F. Kennedy Jr. is an AI guy. Last week, during a stop in Nashville on his Take Back Your Health tour, the Health and Human Services secretary brought up the technology between condemning ultra-processed foods and urging Americans to eat protein. "My agency is now leading the federal government in driving AI into all of our activities," he declared. An army of bots, Kennedy said, will transform medicine, eliminate fraud, and put a virtual doctor in everyone's pocket. RFK Jr. has talked up the promise of infusing his department with AI for months.

artificial intelligence, machine learning, natural language, (16 more...)

The Atlantic - Technology

Country: North America > United States (1.00)

Industry: